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Abstract — This  paper  addresses  the  problem  of  goal-directed 
robot  path-planning  in  the  presence  of  uncertainties  that  are 
induced  by  bounded  environmental  disturbances  and  actuation 
errors.  The  offline  infinite-horizon  optimal  plan  is  locally  updated 
by  online  finite-horizon  adaptive  re-planning  upon  observation 
of  unexpected  events  (e.g.,  detection  of  unanticipated  obstacles). 
The  underlying  theory  is  developed  as  an  extension  of  a  grid- 
based  path  planning  algorithm,  called  is* ,  that  was  formulated 
in  the  framework  of  probabilistic  finite  state  automata  (PFSA) 
and  language  measure  from  a  control-theoretic  perspective.  The 
proposed  concept  has  been  validated  on  a  simulation  test  bed  that 
is  constructed  upon  a  model  of  typical  autonomous  underwater 
vehicles  (AUVs)  in  the  presence  of  uncertainties. 

1 .  Motivation  and  Introduction 

In  general,  path  planning  of  robots  (e,g„  autonomous  under¬ 
water  vehicles  (AUVs)  and  unmanned  aerial  vehicles  (UAVs)) 
aims  to  optimize  either  travel  time,  energy  usage,  or  safety  of 
operations.  Recently,  much  work  has  been  reported  for  path 
planning  in  the  presence  of  environmental  uncertainties.  A  few 
examples  follow. 

Garau  et  al.  [1]  used  the  A*  algorithm  for  path  planning 
of  AUVs  by  taking  the  effects  of  ocean  currents  into  con¬ 
sideration.  Petres  et  al.  [2]  reported  path  planning  of  AUVs, 
in  which  a  fast  marching  (FM)  algorithm  was  used  to  model 
the  effects  of  ocean  currents  as  an  anisotropic  cost  function. 
Rhoads  et  al.  [3]  solved  a  dynamic  Hamilton  Jacobi  Bellman 
(HJB)  equation  to  obtain  the  optimal  ”time-to-go“  with  a 
discontinuous  and  dynamic  cost  function  for  path  planning 
in  the  presence  of  ocean  currents.  Recently,  Lolla  et  al.  [4] 
developed  a  methodology  for  AUV  path  planning,  in  which 
the  concept  of  flow  advection  was  combined  with  nominal 
vehicle  motion  until  the  desired  goal  was  reached.  Majumdar 
et  al.  [5]  generated  libraries  in  the  off-line  pre-computation 
stage  for  online  robust  motion  planning,  where  some  of  the 
library  members  were  regenerated  for  collision  avoidance. 
Blackmore  et  al.  [6]  designed  a  chance-constrained  predictive 
stochastic  controller  to  ensure  that  the  probability  of  failure 
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(e.g.,  collision)  is  less  than  a  specified  threshold  even  in  the 
presence  of  randomly  varying  uncertainties.  Chakravorty  and 
Kumar  [7]  used  the  concepts  of  probabilistic  roadmaps  (PRM) 
and  rapidly  exploring  random  trees  (RRT)  [8]  to  construct 
feedback  controllers  for  robust  planning  in  the  presence  of  mo¬ 
tion  uncertainties.  However,  in  many  path  planning  algorithms 
reported  in  literature,  the  offline  computation  does  not  take 
into  account  potential  runtime  constraints  (e.g.,  unanticipated 
obstacles),  and  online  adaptation  is  computationally  infeasible. 

Robots  (e.g.,  AUVs)  are  expected  to  operate  in  environ¬ 
ments  with  external  disturbances  and  hence,  considering  the 
effects  of  these  disturbances  is  critical  for  mission  success. 
Consequently,  the  algorithms  of  path  planning  must  be  capable 
of  real-time  execution  on  in-situ  computational  platforms.  To 
meet  these  challenges,  Chattopadhyay  et  al.  [9]  formulated 
the  is*  algorithm  in  the  framework  of  probabilistic  finite  state 
automata  (PFSA)  from  a  control-theoretic  perspective.  The  is* 
algorithm  performs  robot  path  planning  for  a  specified  goal  by 
optimizing  the  language  measure  of  the  PFSA  model  [10]  [11] 
that  represents  the  workspace  for  the  robot.  In  this  setting, 
optimal  path  planning  is  equivalent  to  maximization  of  a 
PFSA’s  performance,  based  on  a  quantitative  measure  of  prob¬ 
abilistic  regular  languages.  This  concept  of  robot  path  planning 
jointly  maximizes  the  probability  of  reaching  the  target  and  the 
probability  of  staying  clear  of  the  obstacles.  Such  a  notion  of 
goal-directed  safe  (e.g.,  Pareto  optimization  [12]  of  reaching 
the  goal  and  collision  avoidance)  navigation  of  autonomous 
vehicles  is  particularly  important  in  the  context  of  planning 
in  the  presence  of  disturbances,  especially  to  ensure  collision- 
free  navigation.  In  this  case,  Markov  decision  process  (MDP) 
tools  may  not  be  suitable  because  of  the  requirement  of  robust 
adaptation  in  real  time  [5].  The  situation  becomes  worse  for 
lack  of  observability,  because  the  formulation  of  a  partially 
observable  Markov  decision  process  (POMDP)  would  become 
computationally  intractable  for  a  real-time  solution. 

Language-measure-theoretic  path  planning  offers  an  inher¬ 
ent  advantage  of  global  monotonicity  [9]  in  the  sense  that 
the  solution  iterations  are  finite  and  that  a  sequence  of  final 
waypoints  is  generated.  The  PFSA,  constructed  out  of  the 
robot’s  workspace,  is  optimized  via  an  iterative  sequence  of 
combinatorial  operations  to  maximize  the  language  measure 
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vector  elementwise.  Nevertheless,  the  v*  algorithm  is  different 
from  a  conventional  search  algorithm  in  the  sense  that  it  auto¬ 
matically  generates  a  measure  gradient  (e.g.,  no  potential  func¬ 
tion  required),  which  is  maximized  at  the  goal.  Furthermore, 
the  time  complexity  of  each  iteration  step  is  linear  relative  to 
the  problem  size  (e.g.,  the  dimension  of  the  discretized  state 
space),  and  computation  of  the  measure  vector  in  a  distributed 
fashion  makes  the  algorithm  suitable,  especially  for  high¬ 
dimensional  planning.  These  advantages  render  the  presented 
method  potentially  suitable  for  (offline)  planning  and  (online) 
adaptation  in  the  presence  of  uncertainties. 

As  an  extension  of  the  u*  algorithm  [9],  this  paper  formu¬ 
lates  an  algorithm  of  grid-based  robot  path  planning,  which 
has  been  validated  on  a  simulation  test  bed  of  autonomous 
underwater  vehicles  (AUVs)  in  the  presence  of  uncertain 
ocean  currents  and  actuation  errors.  Major  contributions  of 
the  paper  beyond  the  work  on  the  u*  algorithm,  reported  by 
Chattopadhyay  et  al.  [9],  are  outlined  below. 

•  The  notion  of  uncertainty  is  associated  with  the  uncon¬ 
trollable  transitions  in  a  controlled  autonomous  system. 
This  concept  is  different  from  the  common  practice  of 
superimposing  the  effects  of  disturbances  on  the  planned 
trajectory  of  a  robotic  system. 

•  The  offline  infinite-horizon  optimal  plan  is  locally  up¬ 
dated  by  online  finite-horizon  adaptive  re-planning  upon 
observation  of  unexpected  events  (e.g.,  detection  of  unan¬ 
ticipated  obstacles). 

2.  Brief  Review  of  Language  Measure  Theory 

This  section  summarizes  the  concept  of  (signed  real)  lan¬ 
guage  measure  [10]  of  probabilistic  finite  state  automata 
(PFSA)  [13]  and  the  role  of  language  measure  for  optimal 
control  [11]  of  PFSA.  While  the  theories  of  language  measure 
and  the  associated  optimal  control  are  developed  in  [10] [11], 
this  section  introduces  pertinent  definitions  and  summarizes 
the  essential  concepts  that  are  used  in  the  sequel. 

Definition  2.1  (Alphabet  and  Word)  An  alphabet  E  is  a 
nonempty  finite  set  of  symbols.  A  finite-length  string  of  symbols 
from  E  is  called  a  word  and  the  empty  word  (i.e„  a  string  of  no 
symbols)  is  denoted  as  e.  A  word  w  of  £  symbols  has  length 
|iu|  =  £  and  |e|  =  0.  The  set  of  all  words  (including  e)  is 
denoted  as  E*;  the  set  of  all  words  of  length  £  is  denoted  as 
Yf ;  and  the  set  of  all  infinite  strings  of  symbols  is  denoted  as 
Ew. 

Referring  to  Definition  2.1,  the  cardinalities  of  the  following 
sets  are  noted  below, 

•  | E |  G  N,  where  N  is  the  set  of  positive  integers. 

•  | E^ |  =  |E|£  for  any  positive  integer  £. 

•  j E* |  =  |N|,  i.e.,  E*  is  countably  infinite  (H0). 

•  |E^|  =  |R|,  i.e.,  E1^  is  uncountably  infinite  (continuum). 

Definition  2.2  (Prefix  and  Suffix)  Let  E  be  an  alphabet.  The 
operation  of  concatenation  of  two  words  is  closed  on  E*,  i.e.. 


if  w  =  xy,  then  w  G  E*  \/x,  y  G  E*;  and  e  is  the  identity 
element  of  the  concatenation  monoid  [14],  In  that  case,  x  is 
called  a  prefix  of  w,  and  y  is  called  a  suffix  of  w. 

Definition  2.3  (PFSA)  A  probabilistic  finite  state  automaton 
(PFSA)  over  an  alphabet  E  is  a  quintuple 

<G  =  (Q,  E,  5, 7r,  x),  where 

«  Q  is  the  nonempty  finite  set  of  states,  i.e.,  \Q\  G  N; 

«  S  :  Q  x  E*  —>  Q  is  the  transition  map  that  satisfies  the 
following  conditions :  \/q  G  Q  Vs  G  E  G  E* 

S(q ,  e)  =  q;  and  S(q ,  ws)  =  5(5(q,  w),  s). 

«  7T  :  Q  X  E*  — >  [0, 1]  is  the  morph  function  of  state- 
specific  symbol  generation  probabilities,  which  satisfies 
the  following  conditions:  \/q  G  Q  Vs  G  E  \/w  G  E* 
t r(<7,  s)  >  0;  £sG£  7r(?.  s)  =  and 
7r(g,  e)  =  1;  7 r(q,  ws)  =  n(q,  w)  x  tr(d(q,  w),  s). 

•  X  :  Q  ->■  [-1,1]'QI  is  the  vector-valued  characteristic 
function  that  assigns  a  signed  (normalized)  real  weight 
to  each  state. 

The  (|Q|  X  |Q|)  state  transition  probability  matrix  P  is 
defined  as 

P  =  [Pjk\,  where  Pjk  =  ^  tt(Qj,s)  Vqj,qk  £  Q 

«eS:  8{qj,s)=qk 

Note:  P  is  a  non-negative  stochastic  matrix  [15]. 

Remark  2.1  The  restrictions  <5|qX£  — >  Q  and  tt|qx£  — > 
[0, 1]  of  the  functions  5  and  7 r  in  Definition  2.3  can  be  directly 
obtained  from  the  local  structure  of  the  PFSA.  The  unrestricted 
functions  5  and  7 r  can  then  be  computed  over  the  entire  domain 
by  using  the  recursive  relations  listed  in  Definition  2.3.  Physi¬ 
cal  interpretations  of  %  are  briefly  provided  in  Subsection  2- A 
while  the  details  are  reported  in  Definition  5  of  [10]. 

Definition  2.4  (Language)  Let  (().  E,  6. 7r,  x)  be  a  PFSA.  The 
language  generated  by  all  words,  which  terminates  at  a  state 
qk  G  Q  after  starting  from  qj  G  Q,  is  defined  as 

L(qj,qk)  =  {w  €Y*  :  6{qj,w)  =  qk}  (1) 

The  language  generated  by  all  words,  which  may  terminate  at 
any  state  q  G  Q  after  starting  from  qj  G  Q,  is  defined  as 

ife)  -  IJ  (2) 

q<ZQ 

Given  a  PFSA  (Q,  E,  <5,  n,  x),  the  following  facts  are  con¬ 
sidered  before  the  notion  of  a  measure  is  introduced  on  the 
language  L  (see  Definition  2.4). 

•  The  language  L  is  an  at  most  countable  (i.e.,  finite  or 
countably  infinite)  set  because  L  C  E*.  Therefore,  the 
power  set  2L  is  either  finite  or  uncountable;  if  uncount¬ 
able,  the  cardinality  of  2L  is  the  same  as  that  of  R  or 
that  of  the  standard  Borel  algebra  23(R)  [16]. 

•  A  measurable  space  ( L ,  YU12L)  can  be  constructed  where 
every  singleton  set  of  a  semi-infinite  string  of  symbols. 
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whose  suffix  is  a  word  in  L,  is  a  measurable  set  [16],  [10]. 
For  brevity,  the  u-algebra  T,U2L  is  denoted  as  2L ,  because 
every  word  w  £  L  is  prefixed  by  a  semi-infinite  string  of 
symbols.  That  is,  w  represents  xw  and  { w }  £  2L  implies 
{a?u>}  £  E“2L,  where  x  £  E“  is  arbitrary. 

Definition  2.5  (Language  Measure)  Let  L(qj,qk)  and  L(qj) 
be  languages  on  a  PFSA  (Q,  E,  <5,  tt,  x)  afid  let  9  £  (0, 1)  be 
a  parameter.  A  signed  real  measure  pJek  :  2r^qj'qk'1  — >  JR.  (that 
satisfies  the  requisite  axioms  of  measure  [16])  is  defined  as 

dek(L(lj,qk))  =  -  0)lwltt(qj,w)x(qk)  (3) 

ir&L(qj,qk) 

The  measure  of  the  language  L(qj)  is  defined  as 

~  Pek(L(9j^k))  (4) 

qk&Q 

Following  Definition  2.5,  the  set  of  language  measures  for 
a  PFSA  is  interpreted  as  a  real-valued  vector  of  dimension 
|Q |  and  is  denoted  as  vg  whose  jth  component  is  u°e  £  R 
corresponding  to  the  state  qj  £  Q.  Following  Theorem  1 
in  [10],  the  language  measure  of  the  PFSA  (Q,  E,  <5,  tt,  x)  in 
Eq.  (2.5)  is  expressed  vectorially  as 

is9  =  e[i-(i-e)F}~1x  (5) 

where  P  is  the  state  transition  matrix  (see  Definition  2.3) 
and  the  inverse  on  the  right  side  exists  for  all  9  £  (0,1). 
Furthermore,  as  9  — >  0+,  the  matrix  9[ I  —  (1  —  6>)P] 
converges  to  the  Cesaro  matrix  T7  =  lim^oo  j:  )C;=o 
Then,  the  limiting  measure  vector  u{]  is  obtained  as  [11] 

i/0  =  lim  ue=  lim  0[I  —  (1  —  0)P]_1x  =  V  X  (6) 
e^o+  e->o+ 

where  I  is  the  (|Q|  x  |Q|)  identity  matrix. 

A.  Control  Perspectives 

From  the  perspectives  of  discrete-event  optimal  control  [11], 
the  limiting  language  measure  vector  i>q  in  Eq.  (6)  is  physi¬ 
cally  interpreted  as  follows.  The  language  measure  of  a  symbol 
string  starting  at  a  state  qj  £  Q  is  interpreted  to  be  the  product 
of  the  morph  probability  (conditioned  on  the  state  qj)  and 
the  characteristic  weight  \k  of  the  terminating  state  ///,. .  For 
example,  if  qp,  is  a  desirable  state  (e.g.,  goal)  to  terminate, 
then  Xk  should  be  assigned  to  be  positive;  similarly,  \k 
should  be  negative  for  an  undesirable  terminating  state  (e.g., 
obstacle)  q k.  Thus,  the  characteristic  weights  are  assigned 
to  represent  the  control  specifications  (i.e.,  larger  positive 
weights  to  more  desirable  states)  and  the  language  measure 
represents  the  goodness  of  the  particular  string  relative  to  the 
given  specification  and  the  PFSA  model  [10].  In  the  sense  of 
Definition  2.5,  the  limiting  language  measure  i/q  sums  up  to 
the  limiting  measures  of  each  string  starting  from  state  qj. 
Thus,  z/q  captures  the  goodness  of  qj  based  on  not  only  its 
own  characteristic  weight,  but  also  on  how  good  are  the  strings 
that  will  be  generated  from  qj  in  the  future.  In  essence,  the 


language  measure  provides  a  quantitative  comparison  of  viable 
control  policies  [11], 

Definition  2.6  ( Control  Philosophy)  Let  qj  — >  qk  be  the  state 
transition  under  occurrence  of  the  event  s.  If  the  event  s  is 
disabled  at  qj  by  a  supervisory  action,  then  this  state  transition 
from  qj  to  qk  will  be  prevented  by  forcing  the  plant  to  stay  at 
the  original  state  qj.  Thus,  disabling  an  event  s  at  a  given  state 
q  results  in  deletion  of  the  original  transition  and  appearance 
of  the  self-loop  S(q,  s)  =  q  with  the  probability  of  occurrence 
of  s  at  state  q  remaining  unchanged  in  the  supervised  plant 
and  also  in  the  unsupervised  plant. 

Definition  2.7  (Controllable  Transitions ):  For  a  given  plant, 
transitions  that  can  be  disabled  in  the  sense  of  Definition  2.6 
are  defined  to  be  controllable;  a  transition  that  is  not  con¬ 
trollable  is  called  uncontrollable.  The  set  of  controllable 
transitions  in  a  plant  is  denoted  as  C. 

If  the  supervisor  is  allowed  to  disable  all  subsets  (including 
the  empty  set  0  and  the  set  C  itself)  of  the  set  C  of  controllable 
transitions,  then  there  exists  a  bijection  between  the  set  of 
all  possible  supervision  policies  and  the  power  set  2C;  in 
other  words,  there  are  2 !  c  possible  supervisors  and  each 
supervisor  is  uniquely  identifiable  with  a  subset  of  C.  Thus, 
the  language  measure  v  allows  a  quantitative  comparison  of 
different  policies. 

Definition  2.8  (Controlled  PFSA  and  Optimal  Measure  u*) 
An  optimally  controlled  (or  supervised)  PFSA  §  = 

(Q,  E,  S,  7 r,  Xi  C)  is  a  sextuple  that  maximizes  its  language 
measure  Vq  by  disabling  a  selected  subset  of  C.  The  optimized 
measure  is  denoted  as  u*. 

3.  Description  of  the  Simulation  Test  Bed 

This  section  describes  the  simulation  test  bed  that  is  built 
upon  a  rigid-body  dynamic  model  of  the  planar  motion  of 
a  generic  autonomous  underwater  vehicle  (AUV)  [17].  The 
rationale  for  adopting  such  a  simple  model  is  that  the  usage 
of  the  proposed  language-measure-theoretic  concepts  can  be 
unambiguously  presented  to  explain  how  to  control  the  AUV 
motion  in  the  presence  of  environmental  disturbances  and 
actuation  errors.  Nevertheless  the  theory  is  applicable  on  a 
dynamically  feasible  graph  with  any  degree  of  details. 

The  model  takes  into  account  the  effects  of  yaw  (tit)  and 
the  velocity  vector  of  surge  rate  (it),  sway  rate  ( v ),  and  yaw 
rate  (r),  i.e,  v  =  [u  v  r]  ;  but  the  effects  of  heave  ( z ), 
roll  (cf>)  and  pitch  (9)  are  assumed  to  be  negligible  [17].  The 
resulting  equations  of  motion  are  obtained  as 

Mv  +  C(v)v  +  D(v)v  =  r  and  tp  =  r  (7) 

where  M  =  MT  is  the  inertia  matrix;  C(v)  =  —  C(v)T  is 
the  Coriolis  centripetal  matrix;  D(v)  is  the  damping  matrix; 
and  r  =  [r„  tv  tv]  is  the  vector  of  the  system  inputs 
that  represent  the  forces  generated  by  the  on-board  actuators 
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as  well  as  those  produced  by  environmental  disturbances  (e.g., 
ocean  current).  In  this  model,  control  actions  are  generated  as 
functions  of  the  surge  rate  ( u )  and  yaw  rate  (r)  only,  i.e.,  the 
vehicle  control  input  is  restricted  to  be  Tcon  =  \tu  0  rr]  . 
Surge  and  sway  components  of  the  ocean  current  vector  vc 
are  modeled  in  terms  of  its  direction  /3  and  yaw  angle  ip  as 

uc  =  |vc|  cos(/3  -  ip)  and  vc  =  |vc|  sin(/3  -  ip)  (8) 

Assuming  that  the  vehicle’s  onboard  controller  maintains 
the  desired  yaw  angle  ip  during  its  course  of  motion  (i.e., 
letting  ip  =  r  =  0),  the  dynamic  motion  model  is  further 
simplified  to  obtain  the  travel  distance  F  and  G  along  the 
surge  and  sway  directions  [17],  respectively,  as 

F(t„,mc)=  f  u(t)dt  and  G(vc)  =  f  v(t)dt  (9) 
J  to  J to 

4.  Problem  Formulation 

This  section  formulates  the  problem  of  path  planning  for 
planar  motion  of  robots  in  the  presence  of  uncertainties 
induced  by  environmental  disturbances,  where  the  workspace 
is  partitioned  as  a  finite  number  of  cells.  Major  assumptions 
in  the  problem  formulation  are  listed  below. 

1)  Partial  knowledge  of  the  static  obstacles 

2)  No  moving  obstacles. 

3)  Partitioning  of  the  workspace  into  cells  as  a  regular  grid. 

4)  Bounded  disturbances  and  actuation  errors  with  the 
errors  of  PFSA  states  being  limited  to  neighboring  cells 
of  the  origin  cell. 

Depending  on  the  resolution  of  planning,  a  cell  may 
represent  a  single  location  or  a  sub-region  in  the  planning 
environment.  The  neighborhood  of  a  cell  is  defined  by  taking 
into  account  feasible  dynamics  of  transitions  to  that  cell. 

The  robot  motion  is  modeled  as  controllable  transitions 
among  the  cells  as  the  robot  chooses  to  execute  its  moves;  such 
moves  may  depend  on  the  fidelity  of  motion  discretization  and 
the  robot’s  intrinsic  dynamics.  Based  on  the  specifications  of 
inter-cell  transitions,  each  cell  in  the  partitioned  workspace  is 
modeled  as  a  PFSA  state,  where  controllable  transitions  are 
defined  by  the  corresponding  state  transition  map. 

Let  a  PFSA  Gjvav  =  (Q,  £,  5, 7r,  x)  (see  Definition  2.3) 
represent  the  navigation  model  of  a  robot,  where  the  states  in 
Q  are  cell  locations  in  the  original  workspace  and  the  alphabet 
size  |£|  is  determined  from  the  number  of  events  that  may 
cause  controllable  transitions.  For  simplicity  of  exposition, 
this  paper  considers  a  planar  robot  that  is  allowed  to  freely 
rotate  about  its  geometric  center,  implying  that  |£|  =  9.  As 
an  example,  let  the  robot  be  in  the  state  q  (see  Fig  1)  and  then 
let  the  robot  make  a  choice  to  move  to  one  of  the  states  labeled 
1  through  8  or  it  may  stay  at  q.  While  the  state  transition  map 
S  (restricted  to  Q  x  £)  is  constructed  in  the  neighborhood  of 
a  PFSA  state  and  the  morph  function  rr  (restricted  to  Q  x  £) 
are  constructed  under  the  condition  that,  with  negligible  state 
estimation  error,  the  robot  can  choose  any  controllable  transi¬ 
tion  to  execute  at  a  cell.  Hence,  all  controllable  transitions  at 


Fig.  1.  Illustration  of  the  effects  of  uncontrollable  transitions  due  to  the 
disturbances.  It  is  shown  how  uncontrollable  transitions  may  cause  the  robot 
to  end  up  in  a  different  neighboring  state  of  the  origin  cell,  thus  interfering 
with  control  actions. 

a  cell  are  assumed  to  be  equally  likely  for  the  unsupervised 
automaton  Gnav ■  The  vector- valued  characteristic  function 
X  (see  Definition  2.3)  is  chosen  based  on  a  proiri  known 
goodness  of  individual  states,  explained  in  Subsection  2-A. 
Since  the  environment  map  could  be  only  partially  known, 
the  robot  must  adapt  to  the  unforeseen  obstacles  in  real  time. 

Uncertain  (e.g.,  fault-induced  and  environmental)  distur¬ 
bances  may  cause  uncontrollable  transitions  in  the  robot 
motion  as  seen  in  the  example  of  Fig.  1,  where  the  robot  may 
erroneously  move  to  an  unplanned  state,  thus  interfering  with 
control  actions  of  robot  path  planning  (see  Assumption  #4 
at  the  beginning  of  this  section).  Therefore,  probabilistic 
decisions  of  path  planning  need  to  be  made  based  on  the  set  of 
the  available  controllable  actions  of  the  robot  to  mitigate  the 
effects  of  these  disturbances.  In  this  model,  the  state  transition 
probabilities  serve  as  a  discretized  representation  of  the  effects 
of  disturbances  on  the  robot  dynamics,  where  the  transition 
probability  P(gj  |g,;;  s)  represents  the  probability  of  reaching 
state  qj  after  executing  an  action  s  from  state  qi.  Using  the 
simplified  model  developed  in  the  last  section,  the  vehicle’s 
relative  position  in  the  next  time  step  (from  the  previous 
position  after  it  makes  a  move  s  £  £  in  the  global  Cartesian 
coordinate  frame)  is  expressed  as 

Ax  =  F  cos(ip)  —  G  sin(^)  (10) 

Ay  =  F  sin(ip)  +  G  cos(ip)  (11) 

The  corresponding  transition  probabilities  P  under  the  spec¬ 
ified  simplifying  assumptions  are  then  computed  in  terms  of 
the  probabilities  V  as: 

=V{[Ax,A y]  &qj\qi,s)  (12) 

In  the  absence  of  disturbances,  P (qj\qi,s)  is  either  0  or  1 
such  that  P(<5(g,,  s)\qi,  s)  =  1  and  P(gj|g*,s)  =  0  Wqj  £ 
Q\5(qi,s). 

While  the  PFSA  Gnav  specifies  the  discretized  motion  of 
the  planar  robot,  the  matrix  P  contains  information  about  the 
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effects  of  the  disturbances  on  its  motion.  With  the  knowledge 
of  &nav  and  IP\  the  path  planning  problem  is  reduced  to 
the  optimal  supervision  problem  of  &nav  (in  the  sense  of 
Definition  2.8)  in  the  presence  of  disturbances  specified  by  P. 
Detection  of  unforeseen  obstacles  during  the  robot  operation 
may  result  in  a  change  of  the  characteristic  weight  vector  X- 
In  this  context,  new  obstacles  are  considered  as  undesirable 
terminating  states  and,  therefore,  are  assigned  negative  char¬ 
acteristic  weights  (see  Subsection  2-A). 

5.  Solution  Approach 

Formulation  of  the  path-planning  problem  as  a  PFSA-based 
navigation  allows  computation  of  optimally  feasible  paths  via 
the  language-measure-theoretic  optimization  scheme.  For  the 
unsupervised  model,  the  robot  is  free  to  choose  from  any  of 
the  defined  controllable  events  from  any  cell  in  the  workspace. 

The  optimization  algorithm,  presented  as  Algorithm  1,  se¬ 
lectively  disables  controllable  transitions  to  ensure  that  the 
measure  vector  of  the  navigation  automaton  is  elementwise 
maximized.  Selected  controllable  transitions  are  disabled  to 
optimizes  the  state  transition  probability  matrix  P  (see  Defini¬ 
tion  2.3).  This  is  accomplished  by  maximizing  the  limiting 
measure  vector  (see  Eq.  (6))  in  the  sense  of  Definition 
2.8,  which  corresponds  to  the  optimally  supervised  PFSA 
model  for  the  robot.  This  implies  that  the  supervised  robot 
is  constrained  only  to  choose  among  the  enabled  moves  at 
each  state  such  that  a  Pareto  trade-off  is  achieved  between 
minimization  of  the  collision  probability  and  simultaneous 
maximization  of  the  goal-reaching  probability.  An  important 
point  to  note  is  that  even  though  the  measure  values  are  based 
on  optimization  of  a  probabilistic  finite  state  automata,  an 
optimal  and  feasible  plan  is  obtained  that  can  be  executed 
in  a  deterministic  sense. 

For  planning  in  the  presence  of  environmental  disturbances, 
it  is  critical  to  consider  the  robot’s  vulnerability  to  uncertain¬ 
ties  created  by  such  disturbances  (see  Fig.  1).  In  the  work 
presented  in  [11] [9],  the  measure  vector  was  optimized  by 
ignoring  the  uncontrollable  transitions  arising  from  the  uncer¬ 
tainties.  However,  ignoring  the  uncontrollable  transitions  in  the 
presence  of  disturbances  could  be  fatal  for  autonomous  robots. 
In  this  paper,  the  effects  of  the  uncontrollable  transitions  are 
taken  into  consideration  while  optimizing  the  measure  vector 
of  the  navigation  automaton.  In  this  framework,  a  combinato¬ 
rial  sequence  of  enabling  and  disabling  of  transitions  optimizes 
a  weighted  combination  of  minimizing  the  probability  of  col¬ 
lision  and  maximizing  the  probability  of  reaching  the  goal  in 
the  presence  of  known  environmental  disturbances.  Depending 
on  the  strength  of  environmental  disturbances,  the  controller 
attempts  to  make  an  optimal  trade-off  between  performance 
(i.e.,  path  length)  and  robustness  to  uncertainties. 

With  Monte  Carlo  simulation  of  the  disturbances  on  the 
test  bed,  conditional  probabilities  for  the  robot  (i.e.,  AUV)  to 
terminate  at  a  neighboring  state  qj  are  generated  as 

IPfekus)  such  that  E  ¥(qj\qi,s)  =  l  (13) 

qjeN(qi) 


where  qj  G  Q  is  a  neighboring  state  of  </,  and  s£S  represents 
the  control  action. 

Let  CJnav  =  (Q,  <5, 7r,  x)  be  the  unsupervised  navigation 

automaton.  From  the  knowledge  of  the  morph  probability 
7 r  and  the  conditional  probability  distribution  P  due  to  the 
environmental  disturbances,  a  family  of  probability  matrices, 
PgO,  -,j),  J  =  1,2,---  ,  |N(gj)|  is  constructed,  where  \N(qi)\ 
is  cardinality  of  the  set  N(qi)  of  neighborhood  states  (includ¬ 
ing  itself)  of  the  state  qi  (e.g.,  |IV(gj)|  =  9  for  any  internal 
state  of  a  planar  robot),  defined  as 

PG(qi,s,qj)  =  P(qj\qi,s)  x  7r(g,;,s)  Vg.eQ  (14) 

where  s  £  E  is  a  control  action  and  <5(g,,s)  =  qj  is  a 
neighboring  state  (including  itself)  of  qi.  (used  in  line 

24  of  Algorithm  1)  is  a  local  measure  vector  of  state  qi  and 
consists  of  the  measure  values  of  the  states  qj  £  N(qi).  The 
transition  matrix  Pg  is  optimized  in  an  iterative  fashion  by 
Algorithm  1  such  that  the  strings,  which  are  likely  to  lead  to 
collision  in  the  presence  of  the  disturbances,  are  disabled.  At 
the  same  time,  feasible  paths  to  the  goal  are  retained  so  that 
the  measure  vector  is  elementwise  maximized. 

For  path  planning  in  a  large-dimensional  state  space,  a 
distributed  iterative  scheme  is  recommended  for  the  compu¬ 
tation  of  the  language  measure,  which  was  reported  in  an 
earlier  publication  [18].  The  algorithm  of  optimized  language 
measure  vector  is  presented  as  Algorithm  1. 

Salient  properties  of  Algorithm  1  are  delineated  below. 

•  Computation  of  the  uncertain  behavior  of  the  robot’s 
motion  caused  by  environmental  disturbances :  The  se¬ 
quence  of  disabling  and  enabling  actions  at  the  robot’s 
discretion  is  generated  by  (elementwise)  maximization  of 
the  language  measure  vector.  To  relax  the  assumption  of 
restriction  of  uncontrollable  transitions  to  the  neighboring 
states,  a  probability  distribution  over  the  entire  workspace 
could  be  considered  instead  of  using  a  local  measure 
vector  in  line  24  of  Algorithm  1.  This  is  also  true  in 
trajectory  planning  for  UAVs,  where  the  environmental 
disturbances  may  require  widening  of  its  flight  envelope. 

•  Generation  of  the  optimized  measure  vector  u*  for  robot 
navigation  based  on  its  PFSA  model,  along  with  the 
associated  optimized  outputs  7 r*  and  Pf .  The  path  to 
the  goal  from  arbitrary  initial  locations  is  obtained  by 
following  the  gradient  of  measure  vector. 

Real-time  obstacle  avoidance  in  the  presence  of  disturbances 
is  a  difficult  problem  due  to  the  disturbance-induced  uncon¬ 
trollable  transitions.  An  online  re-planning  algorithm  must 
propagate  the  motion  model  in  time  and  make  sure  that  the 
robot  is  clear  of  the  obstacle.  In  this  regard,  one  may  take 
advantage  of  the  fact  that  the  stochastic  navigation  matrix 
(i.e.,  the  optimized  state  transition  matrix  of  the  controlled 
PFSA)  contains  the  information  on  the  disturbances.  To  make 
a  local  adaptation,  the  original  infinite  horizon  measure  vector 
is  augmented  with  the  information  on  recently  discovered  ob¬ 
stacles.  The  effects  of  this  obstacle  are  considered  on  a  finite- 
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(a) 


(b) 


(c) 


Fig.  2.  Optimal  paths  under  different  directions  of  the  ocean  current.  (The  vehicle  is  constrained  to  stay  in  the  box  and  not  hit  the  walls.) 

those  strings  that  reach  the  goal  via  the  newly  detected  obstacle 
states  from  the  language  generated  by  any  state  of  the  PFSA. 
The  measure  vector  is  consequently  updated  by  removing 
those  strings  from  the  original  optimal  language  generated  by 
the  states  of  the  PFSA,  i.e.,  by  subtracting  the  measure  of 
such  strings  (see  Definition  2.5)  from  the  original  language 
measure.  The  resulting  path  improves  monotonically  in  the 
re-planning  time  horizon  because  the  estimate  of  the  measure 
of  such  strings  improves  monotonically. 


Algorithm  1  Distributed  updating  of  measure  vector  for  planning 
in  the  presence  of  disturbances 

Require: 

Ensure:  v 
1 


S,5,7T,  x),0,Pg 
Initialize  V  q*  E  Q,  =  0. 


p  dum 


=  pg 

=  7T 


/  *  *  Begin  infinite  asynchronous  loop  *  *  / 
while  true  do 

for  each  cell  qi  E  Q  do 
if  N(qi)  /  0  then 

for  each  node  qj  E  N(qi)  do 
Query  Vq  and  x? 

/  *  *  Enabling  and  Disabling  transitions  for  Control  *  *  / 
/  *  *  Compare  measure  with  neighbors  and  allow  transitions 
only  to  the  better  neighbors  *  *  / 


then 


if  vJe  <  vL 

T,k-.qkel v(9i)  pcfe,  Sj,  *0  +  0  then 

Pete,  si,  0  =  Pg(?*,< 

,0  =  o 


,0+P Gm(H 

Pete, 

Trfe.Si)  =  7T  (qi,Si)  +  TTdurn(qi,Si) 


i,  0 


Algorithm  2  Updating  of  measure  vector  for  re-planning 

Require:  GNav(Q,  £,  <5,  tt*,  x),  0,  PG,  u* ,  |T| 

Ensure.  l*replan>  PG .replan  ’  ^replan 

1:  Update  x  with  new  information  about  obstacles 
Initialize  t/g  =  v*  &  n  =  1 

pdnm  =  p*  and  ndum  _  „.* 


i)  =  0 


r  (?i, 

end  if 

/  **  where,  5(qi,Si)  =  qt  &  S(qi,Sj )  =  qj  *  *  / 
/  *  *  Disabling  *  *  / 
else 

if  J2k-.qkGN(qi)  Pc(9i,  Sj,  k)  ==  0  then 

PG(ffi,*J-,:)=Pgum(«.*j.O 
PG(gi,si,:)  =  PG(9i,Si,:)-P^m(?i, 
7r(<2i,Si)  =  7r(qi,Si)  -  ndurn(qi,  Si) 
n(qi,Sj)  =  ndurn(qi,Sj) 

/  *  *  Enabling  *  *  / 
end  if 
end  if 


ydum  . 

G  —  x  G 

while  n  <  \T\  do 

Lines  4  through  24  of  Algorithm  1 
n  =  n  +  1 

end  while 


Si,  o 


^0  -  E k:qkeN(qi) 

/  *  *  Node  updating  *  *  / 

end  for 
end  if 
end  for 
end  while 


(1  -  0)Pg (9i,Sfc,  0^r(?i)  +  9  : 


time  horizon  of  re-planning  by  using  the  already  optimized 
stochastic  navigation  matrix  Pq.  However,  as  the  stochastic 
matrix  was  already  optimized  by  considering  the  effects  of 
disturbances,  the  re -planning  keeps  the  vehicle  clear  of  the 
obstacle  under  such  disturbances.  The  quality  of  adaptation 
is  expected  to  improve  with  an  increase  in  the  re-planning 
time  horizon.  Hence,  a  choice  of  the  re-planning  time-horizon 
(T)  depends  on  the  trade-off  between  the  computation  time 
and  quality  of  the  solution.  The  steps  involved  in  re-planning 
are  succinctly  presented  in  Algorithm  2,  which  efficiently 
computes  a  feasible  path.  This  is  accomplished  by  removing 


Remark  5.1  The  time  complexity  of  Algorithm  1  is  n  ~ 
0(\Q\\N(q)\2),  for  a  PFSA  with  |Q|  states  and  \N(q)\  is  the 
maximum  neighborhood  size  of  the  PFSA  states  [18],  The  time 
complexity  of  Algorithm  2  is  t 2  ~  0(|T||7V(q)|),  where  T  is 
the  finite-time  horizon  of  re-planning. 

6.  Interpretation  of  the  Results 

This  section  presents  pertinent  results  of  AUV  simulation 
for  the  proposed  method  of  path  planning  in  the  presence  of 
uncertainties  due  to  ocean  currents.  To  justify  the  usage  of 
a  disturbance-aware  path  planning  algorithm  for  AUVs  and 
to  demonstrate  the  algorithm’s  efficacy  to  react  to  different 
types  of  disturbances.  Fig.  2  shows  the  (simulated)  optimal 
paths  of  an  AUV  that  is  navigated  from  point  A  to  point  B 
in  a  similar  environment  with  three  different  directions  of  the 
ocean  current.  The  optimal  nominal  paths  in  these  three  cases 
are  significantly  different,  because  the  paths  are  calculated  to 
accommodate  the  disturbances  due  to  the  ocean  current. 

During  actual  navigation,  the  AUV  is  likely  to  deviate  from 
the  nominal  trajectory;  however,  it  finds  a  path  from  any 
arbitrary  state  to  the  terminal  state  by  following  the  gradient 
of  the  optimized  measure  vector.  The  nominal  trajectory  in 
each  of  the  three  plates  in  Fig.  2  shows  the  shortest  safe  path 
based  on  the  expected  deviations  due  to  the  disturbances. 

Figure  3  presents  a  scenario,  where  the  environment  is 
only  partially  known  a  priori  and  an  unforeseen  obstacle 
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Fig.  3.  Real-time  re-planning  over  a  finite  time  horizon  T. 


at  point  D  is  observed  when  the  AUV  reaches  point  C. 
Consequently,  the  original  plan  (i.e.,  without  the  knowledge  of 
the  obstacle  at  D )  is  altered  to  avoid  a  potential  collision.  In 
Fig.  3,  the  uncontrollable  transitions  (that  are  due  to  the  ocean 
current)  tend  to  push  the  vehicle  upstream  and  path  re-planning 
attempts  to  by-pass  the  obstacle  at  D\  however,  the  amount  of 
time  available  for  online  re-planning  is  limited.  Algorithm  2  is 
iteratively  executed  to  improve  the  re-planning  monotonically 
with  an  increase  in  the  span  \T\  of  time  horizon  T.  As  seen 
for  \T\  =  2,5,  and  10  in  Fig.  3,  the  respective  re-plans  are 
generated  to  navigate  the  AUV  further  down  so  that  even  if  the 
vehicle  is  drifted  upwards  by  the  ocean  current,  it  would  still 
be  able  to  avoid  a  collision  with  the  obstacle.  By  increasing 
the  re-planning  time  horizon  from  \T\  =  2  to  \T\  =  10, 
the  plan  becomes  more  robust  to  collision  as  it  attempts  to 
keep  the  vehicle  clear  of  the  obstacle  by  a  larger  distance 
by  optimizing  the  margin  of  error  due  to  the  ocean-current- 
induced  disturbances.  A  small  re -planning  time  horizon  results 
in  local  greedy  search  of  the  configuration  space;  it  limits 
the  effects  of  the  new  obstacle  to  a  narrow  region  around  it. 
A  smaller  re-planning  time  horizon  also  limits  the  expected 
deviation  of  the  vehicle  due  to  disturbances. 

A  global  optimal  policy  could  be  significantly  different  from 
a  local  greedy  solution  obtained  in  limited  time  as  seen  in 
Fig.  3,  where  the  the  re-planned  path  for  |T|  =  15  is  different 
from  those  for  \T\  =  2,  5,  and  10.  While  Algorithm  2  makes 
local  perturbations  in  the  initial  plan  to  keep  the  AUV  clear 
of  an  observed  obstacle  at  D,  a  sufficiently  large  re-planning 
time  horizon  (e.g.,  \T\  =  15  in  Fig.  3)  generates  an  optimal 
path  such  that  the  local  changes  in  the  re -planned  path  may 
finally  cause  convergence  to  a  global  optimal  path. 

As  the  original  optimal  stochastic  navigation  matrix  contains 
information  of  the  disturbance,  even  a  small  re-planning  time 
horizon  could  keep  the  AUV  clear  of  the  obstacle  in  the 
presence  of  disturbances;  however,  robustness  of  the  plans 
improve  monotonically  with  increase  in  the  re-planning  time 
horizon  parameter  |T|.  In  essence,  a  choice  of  |T|  depends 
on  the  in-situ  computational  capabilities  of  the  AUV  and  the 
time  available  for  online  updating  of  the  path  plan,  which  may 


depend  on  the  AUV’s  speed  and  observation  window. 

For  time-critical  operations  with  limited  computational  ca¬ 
pabilities,  a  goal  of  re-planning  is  to  ensure  safety  while 
maintaining  an  accepted  level  of  performance  (e.g.,  path 
length).  This  trade-off  could  be  achieved  in  the  language- 
measure-theoretic  framework  by  selecting  the  characteristic 
weight  of  new-obstacle  states  during  re-planning,  where  an 
increased  penalty  on  collision  with  an  obstacle  would  enhance 
safety  at  the  expense  of  a  longer  deviation  from  the  original 
plan.  This  is  achieved  by  having  the  characteristic  weights  of 
the  newly  observed  obstacles  states  as  K  *  Xobs >  where  the 
multiplicative  constant  K  is  a  positive  real  parameter,  while 
all  other  characteristic  weights  remain  unchanged.  Figure  4 
shows  the  effects  of  changing  the  characteristic  weight  vector 
for  the  case  |T|  =  2  in  Fig.  3.  As  seen  in  Fig.  4,  the  safety 
margin  is  enhanced  by  increasing  K,  which  is  capable  of 
accommodating  larger  anomalous  behavior  of  the  robot  in  the 
presence  of  ocean  currents  at  the  expense  of  an  increased  path 
length.  For  a  family  of  admissible  values  of  K,  the  convex  hull 
of  the  generated  solutions  would  lead  to  a  Pareto-optimal  front 
representing  tradeoff  between  safety  and  performance.  For  any 
particular  real-time  operation,  the  choice  of  the  parameter  K 
is  dependent  on  computational  capabilities  and  time  available 
for  re-planning  as  well  as  strength  of  disturbances. 
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Fig.  4.  Re-planning  with  \T\  =  2  for  different  characteristic  weights. 
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7.  Summary,  Conclusions  and  Future  Work 

This  paper  presents  a  generalized  framework  for  robust 
path  planning  of  AUVs  in  the  presence  of  environmental 
disturbances  and  actuation  errors.  Concepts  of  recently  devel¬ 
oped  language  measure -based  optimization  [  10]  [  1 1]  have  been 
applied  to  demonstrate  successful  navigation  of  an  autonomous 
robot  moving  in  an  uncertain  and  partially  known  environment. 
The  algorithm  is  robust  to  both  modeling  and  environmental 
uncertainties;  however,  these  uncertainties  must  be  bounded. 
The  efficacy  of  the  proposed  concept  is  demonstrated  by 
numerical  simulation  on  a  test  bed  of  an  AUV  moving  in 
the  presence  of  ocean  currents  and  unforeseen  obstacles.  The 
paper  is  summarized  below. 
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1)  It  presents  a  path  planning  philosophy  that  is  fundamen¬ 
tally  different  from  the  ones  reported  in  literature. 

2)  It  performs  online  adaptation  of  the  planning  algorithms 
to  unanticipated  obstacles  in  the  presence  of  distur¬ 
bances. 

Many  path  planning  algorithms,  reported  in  the  current  litera¬ 
ture,  may  not  be  able  to  efficiently  handle  collision  avoidance 
in  real  time  in  the  presence  of  disturbances,  primarily  due 
to  overwhelming  computational  requirements.  It  is  shown  in 
this  paper  through  numerical  simulations  that  the  language 
measure-theoretic  algorithm  can  handle  unanticipated  obsta¬ 
cles  in  real  time  in  the  presence  of  disturbances.  However, 
actual  performance  comparisons  with  state-of-the-art  planning 
is  yet  to  be  demonstrated  by  experimental  validation. 

While  there  are  numerous  research  areas  for  robot  path 
planning  based  on  language-measure-theoretic  concepts,  the 
following  topics  are  recommended  for  future  research. 

1)  Construction  of  a  biased  sampling  scheme  while  up¬ 
dating  the  language  measure  of  nodes  in  a  distributed 
fashion  to  enhance  the  quality  of  path  planning  and  to 
mitigate  the  computational  cost.  Such  a  scheme  is  impor¬ 
tant  for  a  trade-off  between  exploration  and  exploitation. 

2)  Extension  of  the  proposed  method  of  robot  path  planning 
under  time  constraints  as  well  as  by  minimizing  the 
energy  consumption. 

3)  Validation  of  the  proposed  method  by  laboratory  exper¬ 
imentation  on  a  networked  robotic  test  bed  with  non- 
holonomic  constraints  [8]. 

4)  Performance  comparison  with  other  methods  like  those 
presented  in  [5] [6] [7], 
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